224 research outputs found

    Non-oblivious Strategy Improvement

    We study strategy improvement algorithms for mean-payoff and parity games. We describe a structural property of these games, and we show that these structures can affect the behaviour of strategy improvement. We show how awareness of these structures can be used to accelerate strategy improvement algorithms. We call our algorithms non-oblivious because they remember properties of the game that they have discovered in previous iterations. We show that non-oblivious strategy improvement algorithms perform well on examples that are known to be hard for oblivious strategy improvement. Hence, we argue that previous strategy improvement algorithms fail because they ignore the structural properties of the game that they are solving.

    The Complexity of All-switches Strategy Improvement

    Strategy improvement is a widely used and well-studied class of algorithms for solving graph-based infinite games. These algorithms are parameterized by a switching rule, and one of the most natural rules is "all switches", which switches as many edges as possible in each iteration. Continuing a recent line of work, we study all-switches strategy improvement from the perspective of computational complexity. We consider two natural decision problems, both of which have as input a game $G$, a starting strategy $s$, and an edge $e$. The problems are: 1.) The edge switch problem, namely, is the edge $e$ ever switched by all-switches strategy improvement when it is started from $s$ on game $G$? 2.) The optimal strategy problem, namely, is the edge $e$ used in the final strategy that is found by strategy improvement when it is started from $s$ on game $G$? We show $\mathtt{PSPACE}$-completeness of the edge switch problem and optimal strategy problem for the following settings: parity games with the discrete strategy improvement algorithm of Vöge and Jurdziński; mean-payoff games with the gain-bias algorithm [14,37]; and discounted-payoff games and simple stochastic games with their standard strategy improvement algorithms. We also show $\mathtt{PSPACE}$-completeness of an analogous problem to edge switch for the bottom-antipodal algorithm for finding the sink of an Acyclic Unique Sink Orientation on a cube.

    Time and Parallelizability Results for Parity Games with Bounded Tree and DAG Width

    Parity games are a much-researched class of games in NP ∩ coNP that are not known to be in P. Consequently, researchers have considered specialised algorithms for the case where certain graph parameters are small. In this paper, we study parity games on graphs with bounded treewidth, and graphs with bounded DAG width. We show that parity games with bounded DAG width can be solved in O(n^(k+3) k^(k + 2) (d + 1)^(3k + 2)) time, where n, k, and d are the size, treewidth, and number of priorities in the parity game. This is an improvement over the previous best algorithm, given by Berwanger et al., which runs in n^O(k^2) time. We also show that, if a tree decomposition is provided, then parity games with bounded treewidth can be solved in O(n k^(k + 5) (d + 1)^(3k + 5)) time. This improves over the previous best algorithm, given by Obdrzalek, which runs in O(n d^(2(k+1)^2)) time. Our techniques can also be adapted to show that the problem of solving parity games with bounded treewidth lies in the complexity class NC^2, which is the class of problems that can be efficiently parallelized. This is in stark contrast to the general parity game problem, which is known to be P-hard, and thus unlikely to be contained in NC.

    Bounded Satisfiability for PCTL

    While model checking PCTL for Markov chains is decidable in polynomial time, the decidability of PCTL satisfiability, as well as its finite model property, are long-standing open problems. While general satisfiability is an intriguing challenge from a purely theoretical point of view, we argue that general solutions would not be of interest to practitioners: such solutions could be too big to be implementable or even infinite. Inspired by bounded synthesis techniques, we turn to the more applied problem of seeking models of a bounded size: we restrict our search to implementable -- and therefore reasonably simple -- models. We propose a procedure to decide whether or not a given PCTL formula has an implementable model by reducing it to an SMT problem. We have implemented our techniques and found that they can be applied to the practical problem of sanity checking -- a procedure that allows a system designer to check whether their formula has an unexpectedly small model.

    Computing Approximate Nash Equilibria in Polymatrix Games

    In an $\epsilon$-Nash equilibrium, a player can gain at most $\epsilon$ by unilaterally changing his behaviour. For two-player (bimatrix) games with payoffs in $[0,1]$, the best-known $\epsilon$ achievable in polynomial time is 0.3393. In general, for $n$-player games an $\epsilon$-Nash equilibrium can be computed in polynomial time for an $\epsilon$ that is an increasing function of $n$ but does not depend on the number of strategies of the players. For three-player and four-player games the corresponding values of $\epsilon$ are 0.6022 and 0.7153, respectively. Polymatrix games are a restriction of general $n$-player games where a player's payoff is the sum of payoffs from a number of bimatrix games. There exists a very small but constant $\epsilon$ such that computing an $\epsilon$-Nash equilibrium of a polymatrix game is PPAD-hard. Our main result is that a $(0.5+\delta)$-Nash equilibrium of an $n$-player polymatrix game can be computed in time polynomial in the input size and $\frac{1}{\delta}$. Inspired by the algorithm of Tsaknakis and Spirakis, our algorithm uses gradient descent on the maximum regret of the players. We also show that this algorithm can be applied to efficiently find a $(0.5+\delta)$-Nash equilibrium in a two-player Bayesian game.

    Strategy iteration algorithms for games and Markov decision processes

    In this thesis, we consider the problem of solving two-player infinite games, such as parity games, mean-payoff games, and discounted games, and the problem of solving Markov decision processes. We study a specific type of algorithm for solving these problems that we call strategy iteration algorithms. Strategy improvement algorithms are an example of a type of algorithm that falls under this classification. We also study Lemke’s algorithm and the Cottle-Dantzig algorithm, which are classical pivoting algorithms for solving the linear complementarity problem. The reduction of Jurdzinski and Savani from discounted games to LCPs allows these algorithms to be applied to infinite games [JS08]. We show that, when they are applied to games, these algorithms can be viewed as strategy iteration algorithms. We also resolve the question of their running time on these games by providing a family of examples upon which these algorithms take exponential time. Greedy strategy improvement is a natural variation of strategy improvement, and Friedmann has recently shown an exponential lower bound for this algorithm when it is applied to infinite games [Fri09]. However, these lower bounds do not apply to Markov decision processes. We extend Friedmann’s work in order to prove an exponential lower bound for greedy strategy improvement in the MDP setting. We also study variations on strategy improvement for infinite games. We show that there are structures in these games that current strategy improvement algorithms do not take advantage of. We also show that lower bounds given by Friedmann [Fri09], and those that are based on his work [FHZ10], work because they exploit this ignorance. We use our insight to design strategy improvement algorithms that avoid poor performance caused by the structures that these examples use.

    Distributed Methods for Computing Approximate Equilibria

    We present a new, distributed method to compute approximate Nash equilibria in bimatrix games. In contrast to previous approaches that analyze the two payoff matrices at the same time (for example, by solving a single LP that combines the two players' payoffs), our algorithm first solves two independent LPs, each of which is derived from one of the two payoff matrices, and then computes approximate Nash equilibria using only limited communication between the players. Our method has several applications for improved bounds for efficient computations of approximate Nash equilibria in bimatrix games. First, it yields the best polynomial-time algorithm for computing approximate well-supported Nash equilibria (WSNE), which is guaranteed to find a 0.6528-WSNE in polynomial time. Furthermore, since our algorithm solves the two LPs separately, it can be used to improve upon the best known algorithms in the limited communication setting: the algorithm can be implemented to obtain a randomized expected-polynomial-time algorithm that uses poly-logarithmic communication and finds a 0.6528-WSNE. The algorithm can also be carried out to beat the best known bound in the query complexity setting, requiring $O(n \log n)$ payoff queries to compute a 0.6528-WSNE. Finally, our approach can also be adapted to provide the best known communication-efficient algorithm for computing approximate Nash equilibria: it uses poly-logarithmic communication to find a 0.382-approximate Nash equilibrium.

    Lipschitz Continuity and Approximate Equilibria
